Multi-Value-Functions: Efficient Automatic Action Hierarchies for Multiple Goal MDPs

Authors

  • Andrew W. Moore
  • Leemon C. Baird
  • Leslie Kaelbling
Abstract

If you have planned to achieve one particular goal in a stochastic delayed-rewards problem and then someone asks about a different goal, what should you do? What if you need to be ready to quickly supply an answer for any possible goal? This paper shows that by using a new kind of automatically generated abstract action hierarchy, with N states, preparing for all of N possible goals can be much, much cheaper than N times the work of preparing for one goal. In goal-based Markov Decision Problems, it is usual to generate a policy π(x), mapping states to actions, and a value function J(x), mapping states to an estimate of minimum expected cost-to-goal, starting at x. In this paper we will use the terminology that a multi-policy π(x, y) (for all state-pairs (x, y)) maps a state x to the first action it should take in order to reach y with expected minimum cost, and a multi-value function J(x, y) is a definition of this minimum cost. Building these objects quickly and with little memory is the main purpose of this paper, but a secondary result is a natural, automatic way to create a set of parsimonious yet powerful abstract actions for MDPs. The paper concludes with a set of empirical results on increasingly large MDPs.
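To make the multi-value function object concrete, here is a minimal sketch of what J(x, y) looks like on a small deterministic MDP. This is not the paper's hierarchy algorithm: it is a naive all-pairs baseline using Floyd-Warshall relaxation, the kind of N-goal computation the paper's abstract action hierarchies aim to beat. The states, edges, and costs are hypothetical examples.

```python
# Illustrative sketch (not the paper's method): compute a multi-value
# function J[x][y] = minimum cost from state x to goal y for a small
# deterministic MDP, via Floyd-Warshall all-pairs relaxation.

INF = float("inf")

def multi_value_function(n_states, transitions):
    """transitions: list of (x, y, cost) deterministic action edges."""
    # J[x][y] holds the best known cost-to-goal from x to y;
    # reaching a goal you are already at costs nothing.
    J = [[0.0 if x == y else INF for y in range(n_states)]
         for x in range(n_states)]
    for x, y, c in transitions:
        J[x][y] = min(J[x][y], c)
    # Allow each intermediate state k in turn, relaxing every pair.
    for k in range(n_states):
        for x in range(n_states):
            for y in range(n_states):
                if J[x][k] + J[k][y] < J[x][y]:
                    J[x][y] = J[x][k] + J[k][y]
    return J

# Hypothetical 4-state chain with one shortcut edge.
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (0, 2, 1.5)]
J = multi_value_function(4, edges)
print(J[0][3])  # shortcut path 0 -> 2 -> 3 costs 2.5, beating 3.0
```

This brute-force table costs O(N^3) time and O(N^2) memory; the paper's point is that an automatically generated abstract action hierarchy can recover the same all-goals readiness far more cheaply.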


Similar Resources


Basis function construction for hierarchical reinforcement learning

Much past work on solving Markov decision processes (MDPs) using reinforcement learning (RL) has relied on combining parameter estimation methods with hand-designed function approximation architectures for representing value functions. Recently, there has been growing interest in a broader framework that combines representation discovery and control learning, where value functions are approxima...



Multi-level Association Rule Mining: an Object-oriented Approach Based on Dynamic Hierarchies

Previous studies in data mining have yielded efficient algorithms for discovering association rules. But it is a well-known problem that the two controlling measures of support and confidence, when used as the sole definition of relevant association rules, are too inclusive: interesting rules are included with many uninteresting cases. A typical approach to this problem is to augment the thresholds ...



Journal:

Volume   Issue 

Pages  -

Publication date: 1999